Optimizing MapReduce for Multicore Architectures
MapReduce is a programming model for data-parallel programs originally intended for data centers. MapReduce simplifies parallel programming, hiding synchronization and task management. These properties make it a promising programming model for future processors with many cores, and existing MapReduce libraries such as Phoenix have demonstrated that applications written with MapReduce perform competitively with those written with Pthreads. This paper explores the design of the MapReduce data structures for grouping intermediate key/value pairs, which is often a performance bottleneck on multicore processors. The paper finds the best choice depends on workload characteristics, such as the number of keys used by the application and the degree of repetition of keys. This paper also introduces a new MapReduce library, Metis, with a compromise data structure designed to perform well for most workloads. Experiments with the Phoenix benchmarks on a 16-core AMD-based server show that Metis' data structure performs better than simpler alternatives, including Phoenix
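The grouping step that the abstract identifies as a bottleneck can be illustrated with a minimal, single-threaded sketch. It uses a plain hash table of value lists, which is only one of the simpler alternatives and is not Metis' data structure; the function names (`map_words`, `group_pairs`, `reduce_counts`, `word_count`) are hypothetical and chosen for this example.

```python
from collections import defaultdict


def map_words(chunk):
    """Map: emit one (word, 1) pair per word in a chunk of text."""
    for word in chunk.split():
        yield word, 1


def group_pairs(pairs):
    """Group: collect the values emitted for each key in a hash table."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(groups):
    """Reduce: fold each key's list of values into a single count."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(chunks):
    """Run map, group, and reduce over an iterable of text chunks."""
    pairs = (pair for chunk in chunks for pair in map_words(chunk))
    return reduce_counts(group_pairs(pairs))
```

With few distinct keys and many repeats, as in word counting, the per-key lists grow long; with many distinct keys, the table itself dominates. That dependence on the workload is the trade-off the abstract describes.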
Whanaungatanga: Sybil-proof routing with social networks
Decentralized systems, such as distributed hash tables, are subject to the Sybil attack, in which an adversary creates many false identities to increase its influence. This paper proposes a routing protocol for a distributed hash table that is strongly resistant to the Sybil attack. This is the first solution to this problem with sublinear run time and space usage. The protocol uses the social connections between users to build routing tables that enable Sybil-resistant distributed hash table lookups. With a social network of N well-connected honest nodes, the protocol can tolerate up to O(N/log N) "attack edges" (social links from honest users to phony identities). This means that an adversary has to fool a large fraction of the honest users before any lookups will fail. The protocol builds routing tables that contain O(√N log^(3/2) N) entries per node. Lookups take O(1) time. Simulation results, using social network graphs from LiveJournal, Flickr, and YouTube, confirm the analytical results
Retroactive auditing
Retroactive auditing is a new approach for detecting past intrusions and vulnerability exploits based on security patches. It works by spawning two copies of the code that was patched, one with and one without the patch, and running both of them on the same inputs observed during the system's original execution. If the resulting outputs differ, an alarm is raised, since the input may have triggered the patched vulnerability. Unlike prior tools, retroactive auditing does not require developers to write predicates for each vulnerability. United States. Defense Advanced Research Projects Agency. Clean-slate design of Resilient, Adaptive, Secure Hosts (Contract number N66001-10-2-4089) National Natural Science Foundation (CNS-1053143
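The comparison at the heart of retroactive auditing can be sketched in a few lines. This is a toy illustration under stated assumptions, not the system's implementation: the "patched code" is an ordinary Python function, the recorded inputs are a list, and every name here (`RECORDS`, `lookup_unpatched`, `lookup_patched`, `run`, `audit`) is hypothetical.

```python
RECORDS = ["alice", "bob", "secret"]


def lookup_unpatched(index):
    """Hypothetical vulnerable code: negative indexes wrap around."""
    return RECORDS[index]


def lookup_patched(index):
    """Hypothetical patched code: negative indexes are rejected."""
    if index < 0:
        raise ValueError("negative index")
    return RECORDS[index]


def run(func, item):
    """Run one copy on one input and capture its observable outcome."""
    try:
        return ("ok", func(item))
    except Exception as exc:
        return ("error", type(exc).__name__)


def audit(unpatched, patched, recorded_inputs):
    """Return the recorded inputs on which the two copies disagree."""
    return [item for item in recorded_inputs
            if run(unpatched, item) != run(patched, item)]
```

Calling `audit(lookup_unpatched, lookup_patched, [0, 1, -1, 5])` flags only `-1`: the out-of-range input `5` fails identically in both copies, so it raises no alarm. No vulnerability-specific predicate is written; the patch itself defines what counts as suspicious.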
CPHASH: A cache-partitioned hash table
CPHash is a concurrent hash table for multicore processors. CPHash partitions its table across the caches of cores and uses message passing to transfer lookups/inserts to a partition. CPHash's message passing avoids the need for locks, pipelines batches of asynchronous messages, and packs multiple messages into a single cache line transfer. Experiments on an 80-core machine with 2 hardware threads per core show that CPHash has ~1.6x higher throughput than a hash table implemented using fine-grained locks. An analysis shows that CPHash wins because it experiences fewer cache misses and its cache misses are less expensive, because of less contention for the on-chip interconnect and DRAM. CPServer, a key/value cache server using CPHash, achieves ~5% higher throughput than a key/value cache server that uses a hash table with fine-grained locks, but both achieve better throughput and scalability than memcached. The throughput of CPHash and CPServer also scales near-linearly with the number of cores. Quanta Computer (Firm) National Science Foundation (U.S.). (Award 915164
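The partition-plus-message-passing structure can be sketched with threads and queues. This is only a structural illustration under loud assumptions: Python threads stand in for cores, `queue.Queue` stands in for CPHash's cache-line message buffers, and the class and method names (`Partition`, `PartitionedTable`, `execute`) are hypothetical. It shows why the table itself needs no locks, and says nothing about cache behavior or performance.

```python
import queue
import threading


class Partition(threading.Thread):
    """Owns one slice of the table; only this thread ever touches it."""

    def __init__(self):
        super().__init__(daemon=True)
        self.inbox = queue.Queue()
        self.table = {}  # no lock needed: single owner

    def run(self):
        while True:
            ops, reply = self.inbox.get()
            results = []
            for op, key, value in ops:
                if op == "insert":
                    self.table[key] = value
                    results.append(None)
                else:  # "lookup"
                    results.append(self.table.get(key))
            reply.put(results)  # one reply message per batch


class PartitionedTable:
    def __init__(self, num_partitions=4):
        self.partitions = [Partition() for _ in range(num_partitions)]
        for partition in self.partitions:
            partition.start()

    def execute(self, ops):
        """Send a batch of (op, key, value) tuples; return results in order."""
        by_owner = {}
        for position, (op, key, value) in enumerate(ops):
            owner = hash(key) % len(self.partitions)
            by_owner.setdefault(owner, []).append((position, op, key, value))
        pending = []
        for owner, items in by_owner.items():
            reply = queue.Queue(maxsize=1)
            batch = [(op, key, value) for _, op, key, value in items]
            self.partitions[owner].inbox.put((batch, reply))
            pending.append((items, reply))
        results = [None] * len(ops)
        for items, reply in pending:
            for item, result in zip(items, reply.get()):
                results[item[0]] = result
        return results
```

A client batches its operations, sends one message per owning partition, and only then waits for replies, which loosely mirrors the batching and pipelining the abstract describes.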
User-Relative Names for Globally Connected Personal Devices
Nontechnical users who own increasingly ubiquitous network-enabled personal
devices such as laptops, digital cameras, and smart phones need a simple,
intuitive, and secure way to share information and services between their
devices. User Information Architecture, or UIA, is a novel naming and
peer-to-peer connectivity architecture addressing this need. Users assign UIA
names by "introducing" devices to each other on a common local-area network,
but these names remain securely bound to their target as devices migrate.
Multiple devices owned by the same user, once introduced, automatically merge
their namespaces to form a distributed "personal cluster" that the owner can
access or modify from any of his devices. Instead of requiring users to
allocate globally unique names from a central authority, UIA enables users to
assign their own "user-relative" names both to their own devices and to other
users. With UIA, for example, Alice can always access her iPod from any of her
own personal devices at any location via the name "ipod", and her friend Bob
can access her iPod via a relative name like "ipod.Alice". Comment: 7 pages, 1 figure, 1 tabl
Improving application security with data flow assertions
Resin is a new language runtime that helps prevent security vulnerabilities, by allowing programmers to specify application-level data flow assertions. Resin provides policy objects, which programmers use to specify assertion code and metadata; data tracking, which allows programmers to associate assertions with application data, and to keep track of assertions as the data flow through the application; and filter objects, which programmers use to define data flow boundaries at which assertions are checked. Resin's runtime checks data flow assertions by propagating policy objects along with data, as that data moves through the application, and then invoking filter objects when data crosses a data flow boundary, such as when writing data to the network or a file.
Using Resin, Web application programmers can prevent a range of problems, from SQL injection and cross-site scripting, to inadvertent password disclosure and missing access control checks. Adding a Resin assertion to an application requires few changes to the existing application code, and an assertion can reuse existing code and data structures. For instance, 23 lines of code detect and prevent three previously-unknown missing access control vulnerabilities in phpBB, a popular Web forum application. Other assertions comprising tens of lines of code prevent a range of vulnerabilities in Python and PHP applications. A prototype of Resin incurs a 33% CPU overhead running the HotCRP conference management application. Nokia Researc
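The three mechanisms named in the abstract (policy objects, data tracking, and filter objects) can be illustrated with a small sketch. It is not Resin's API: the classes `PasswordPolicy`, `Tracked`, and `NetworkFilter`, the `check_export` method, and the e-mail addresses are all assumptions made for this example, and tracking is done here by a string subclass rather than by a modified language runtime.

```python
class PasswordPolicy:
    """Hypothetical policy object: a password may flow only to its owner."""

    def __init__(self, owner_email):
        self.owner_email = owner_email

    def check_export(self, context):
        if context.get("recipient") != self.owner_email:
            raise PermissionError("password disclosure blocked")


class Tracked(str):
    """Data tracking: a string that carries its policy objects with it."""

    def __new__(cls, value, policies=()):
        obj = super().__new__(cls, value)
        obj.policies = tuple(policies)
        return obj

    def __add__(self, other):
        merged = self.policies + getattr(other, "policies", ())
        return Tracked(str(self) + str(other), merged)

    def __radd__(self, other):
        merged = getattr(other, "policies", ()) + self.policies
        return Tracked(str(other) + str(self), merged)


class NetworkFilter:
    """Hypothetical filter object guarding one data flow boundary."""

    def __init__(self, recipient):
        self.recipient = recipient
        self.sent = []

    def write(self, data):
        # Invoke every policy attached to the data before it leaves.
        for policy in getattr(data, "policies", ()):
            policy.check_export({"recipient": self.recipient})
        self.sent.append(str(data))


password = Tracked("hunter2", [PasswordPolicy("alice@example.com")])
message = "Your password is " + password  # the policy follows the data
```

Writing `message` to `NetworkFilter("alice@example.com")` succeeds, while writing it to a filter for any other recipient raises `PermissionError`. The application code that builds the message needs no changes; the assertion travels with the data and is checked only at the boundary.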
Operating system extensibility through event capture
Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1997. Includes bibliographical references (p. 31). by Thomas Pinckney III. M.Eng
Reinventing Scheduling for Multicore Systems
High performance on multicore processors requires that
schedulers be reinvented. Traditional schedulers focus
on keeping execution units busy by assigning each core
a thread to run. Schedulers ought to focus, however, on
high utilization of on-chip memory, rather than of execution
cores, to reduce the impact of expensive DRAM
and remote cache accesses. A challenge in achieving
good use of on-chip memory is that the memory is split
up among the cores in the form of many small caches.
This paper argues for a form of scheduling that assigns
each object and its operations to a specific core, moving
a thread among the cores as it uses different objects
- …